prediction performance
A Composite Activation Function for Learning Stable Binary Representations
Park, Seokhun, Kim, Choeun, Lee, Kwanho, Park, Sehyun, Kong, Insung, Kim, Yongdai
Activation functions play a central role in neural networks by shaping internal representations. Recently, learning binary activation representations has attracted significant attention due to their advantages in computational and memory efficiency, as well as interpretability. However, training neural networks with Heaviside activations remains challenging, as their non-differentiability obstructs standard gradient-based optimization. In this paper, we propose the Heavy Tailed Activation Function (HTAF), a smooth approximation to the Heaviside function that enables stable training with gradient-based optimization. We construct HTAF as a composite of the sigmoid and hyperbolic tangent functions and theoretically show that it maintains a large gradient mass around zero inputs while exhibiting slower gradient decay in the tail regions. We show that Spiking Neural Networks, Binary Neural Networks and Deep Heaviside Neural Networks can be trained stably using HTAF with gradient-based optimization. Finally, we introduce Implicit Concept Bottleneck Models (ICBMs), interpretable image models that leverage HTAF to induce discrete feature representations. Extensive experiments across various architectures and image datasets demonstrate that ICBMs enable stable discretization while achieving prediction performance comparable to or better than standard models.
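To make the training difficulty concrete, here is a minimal sketch of the generic problem the abstract describes. It is not the paper's HTAF, whose exact formula is not given here. It uses a plain logistic surrogate `sigmoid(k * x)` as an assumed stand-in for "a smooth approximation to the Heaviside function". The function names and the steepness parameter `k` are hypothetical.

```python
import math

def heaviside(x):
    """Hard step: 1.0 for x > 0, else 0.0. Its derivative is zero wherever it exists."""
    return 1.0 if x > 0 else 0.0

def smooth_step(x, k=10.0):
    """Logistic surrogate sigmoid(k * x). An illustrative stand-in, NOT HTAF."""
    z = k * x
    # Branch on the sign of z so that exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def smooth_step_grad(x, k=10.0):
    """Derivative of sigmoid(k * x) with respect to x: k * s * (1 - s)."""
    s = smooth_step(x, k)
    return k * s * (1.0 - s)

if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, heaviside(x), smooth_step(x), smooth_step_grad(x))
```

With this surrogate the gradient is largest at zero and shrinks exponentially as the input moves away from zero. The abstract's claim of slower gradient decay in the tail regions can be read against that kind of behavior.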
e197fe307eb3467035f892dc100d570a-Supplemental-Conference.pdf
In addition to the radar plot, we present the specific numerical values for the prediction and driving performance metrics in Table 1 to provide a more detailed and comprehensive analysis of the system's performance. The static evaluation metrics, ADE and FDE, are computed for predictors trained and validated on the Alignment dataset collected from the SUMMIT simulator. The task-driven evaluation metrics, including safety, efficiency, comfort, and driving performance, are derived from interactive closed-loop scenarios. The process for calculating these metrics is described in Appendix C. Results in Table 1 are used to plot the correlation map between ADE/FDE and driving performance, which surprisingly indicates no strong correlation between static evaluation metrics and real driving performance. Moreover, to ensure the comparability between prediction performance metrics and driving performance metrics in the radar plot, we normalize all metrics to the scale of [0, 1].
B.1 The RVOPlanner
The Reciprocal Velocity Obstacle (RVO) planner is developed based on [8], which expands on the concept of velocity obstacles [4] to consider the reactive behaviors of exo-agents.
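The static metrics and the [0, 1] normalization mentioned in this excerpt can be sketched as follows. This assumes the standard definitions of ADE (mean Euclidean displacement over all predicted timesteps) and FDE (displacement at the final timestep). The excerpt does not say how metrics are rescaled, so min-max scaling is an assumption, and all function names are hypothetical.

```python
import math

def displacement_errors(pred, truth):
    """Per-timestep Euclidean distance between predicted and true (x, y) points."""
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    return [math.dist(p, t) for p, t in zip(pred, truth)]

def ade(pred, truth):
    """Average Displacement Error: mean distance over all timesteps."""
    errs = displacement_errors(pred, truth)
    return sum(errs) / len(errs)

def fde(pred, truth):
    """Final Displacement Error: distance at the last timestep."""
    return displacement_errors(pred, truth)[-1]

def min_max_normalize(values):
    """Rescale values to [0, 1]. Min-max scaling is assumed, not taken from the source."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

if __name__ == "__main__":
    pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    print(ade(pred, truth), fde(pred, truth))
    print(min_max_normalize([2.0, 4.0, 6.0]))
```

Lower ADE and FDE indicate better predictions, whereas some driving metrics may be higher-is-better, so a radar plot would also need a consistent orientation; how the source handles this is not stated in the excerpt.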
Brains on Beats
Umut Güçlü, Jordy Thielen, Michael Hanke, Marcel A. J. van Gerven
We developed task-optimized deep neural networks (DNNs) that achieved state-of-the-art performance in different evaluation scenarios for automatic music tagging. These DNNs were subsequently used to probe the neural representations of music. Representational similarity analysis revealed the existence of a representational gradient across the superior temporal gyrus (STG). Anterior STG was shown to be more sensitive to low-level stimulus features encoded in shallow DNN layers, whereas posterior STG was shown to be more sensitive to high-level stimulus features encoded in deep DNN layers.
Interpretable Prototype-based Graph Information Bottleneck
The success of Graph Neural Networks (GNNs) has led to a need for understanding their decision-making process and providing explanations for their predictions, which has given rise to explainable AI (XAI) that offers transparent explanations for black-box models. Recently, the use of prototypes has successfully improved the explainability of models by learning prototypes that represent the training graphs affecting the prediction. However, these approaches tend to provide prototypes with excessive information from the entire graph, leading to the exclusion of key substructures or the inclusion of irrelevant substructures, which can limit both the interpretability and the performance of the model in downstream tasks. In this work, we propose a novel framework of explainable GNNs, called interpretable Prototype-based Graph Information Bottleneck (PGIB), which incorporates prototype learning within the information bottleneck framework to provide prototypes with the key subgraph from the input graph that is important for the model prediction. This is the first work that incorporates prototype learning into the process of identifying the key subgraphs that have a critical impact on the prediction performance. Extensive experiments, including qualitative analysis, demonstrate that PGIB outperforms state-of-the-art methods in terms of both prediction performance and explainability.